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Abstract 

Drosophilid fruit flies have provided science with striking cases of behavioral adaptation and genetic innovation. A recent example is 
the invasive pest Drosophila suzukii, which, unlike most other Drosophila, lays eggs and feeds on undamaged, ripening fruits. This not 
only poses a serious threat for fruit cultivation but also offers an interesting model to study evolution of behavioral innovation. We 
developed genome and transcriptome resources for D. suzukii. Coupling analyses of these data with field observations, we propose a 
hypothesis of the origin of its peculiar ecology. Using nuclear and mitochondrial phylogenetic analyses, we confirm its Asian origin and 
reveal a surprising sister relationship between the eugracilis and the melanogaster subgroups. Although the D. suzukii genome is 
comparable in size and repeat content to other Drosophila species, it has the lowest nucleotide substitution rate among the species 
analyzed in this study. This finding is compatible with the overwintering diapause of D. suzukii, which results in a reduced number of 
generations per year compared with its sister species. Genome-scale relaxed clock analyses support a late Miocene origin of D. suzukii, 
concomitant with paleogeological and climatic conditions that suggest an adaptation to temperate montane forests, a hypothesis 
confirmed by field trapping. We propose a causal link between the ecological adaptations of D. suzukii in its native habitat and its 
invasive success in Europe and North America. 
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Introduction 

The genus Drosophila is one of the most studied in virtually all 
fields of biology because of an invaluable combination of re- 
productive (high fecundity and short generation time) and 
ecological (wide range of niches and fast adaptability) traits. 
These features have allowed several Drosophila species to 
expand well outside their ancestral range. A classic example 
is Drosophila melanogaster, whose worldwide distribution is 
the result of an out-of-Africa expansion approximately 1 5,000 
years ago (David and Capy 1988). A more recent example of 



this invasiveness is Drosophila suzukii, which in only a handful 
of years has invaded several Western countries from its orig- 
inal Asian distribution. The global spread of D. melanogaster 
has little economic consequence, but the spread of D. suzukii 
is of significant concern. 

Unlike most of its close relatives, which lay eggs only on 
decaying or rotten fruits, D. suzukii lays eggs and feeds on 
unripe and undamaged fruits (Dreves 2011; Walsh et al. 
2011; Rota-Stabelli, Blaxter, et al. 2013), and consequently, 
this species is quickly becoming an economically significant 
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pest of fruit industries. This difference in ecology is reflected in 
morphological adaptations, such as an enlarged serrated ovi- 
positor (used to break ripening fruits), and must also include 
additional neurological, lifecycle, and physiological adaptations 
to finding, and feeding on, unripe food sources. Drosophila 
suzukii is thus a promising model for the study of the origins 
and bases of behavioral innovation. Understanding the cues by 
which D. suzukii finds its host fruits, and the mechanisms used 
for invading and feeding thereon, is a key goal in research 
programs aiming to devise novel control systems (Cini et al. 
2012). 

To investigate the evolutionary history behind the switch in 
the reproductive behavior of D. suzukii from rotten to fresh 
fruit, and to better understand how this species established 
itself in western countries at such an impressive speed, we 
sequenced and annotated the genome and transcriptome of 
D. suzukii iron] an Italian Alpine population. On the basis of 
the combined results of phylogenetic and clock analyses, 
comparative genomics, and field observations, we propose a 
paleoecological scenario to explain the peculiar D. suzukii eco- 
logical behavior. 

Materials and Methods 

Specimens and Sequencing 

Inbred D. suzukii lines were established from individuals col- 
lected at approximately 500 m above sea level (asl) in Valsu- 
gana, Trento, Italy, and subsequently maintained in the 
laboratory under standard conditions. Genomic DNA was ex- 
tracted from 1 0 siblings of an F5 inbred generation (five males 
and five females), whereas total RNA was extracted from 1 5 
unrelated individuals at various developmental stages (five 
males and five females adults, three larvae, and two pupae). 
The pooled cDNA library and two short DNA libraries (180 
base pairs [bp] and 300 bp) were sequenced at the GenePool 
Genomics Facility of the University of Edinburgh, using 100 
base paired-end sequencing on the lllumina Hiseq2000 plat- 
form (proportions were 0.2, 0.4, 0.4 for the cDNA, 180 bp 
and 300 bp libraries, respectively). The raw data have been 
deposited in European Nucleotide Archive (study accession 
ERP001893) and the assembly in the ENA under accession 
numbers CAKG01 000001 -CAKG01 061 569. 

RNAseq Assembly 

The RNAseq sequencing generated a total of 35.7 million 100 
base paired reads. Data quality was evaluated with fastQC 
(http://www.bioinformatics.babraham.ac.uk/projects/fastqc/, 
last accessed April 3, 2013) and Tallymer (Kurtz et al. 2008). 
Low-quality positions were trimmed using fastx (http:// 
hannonlab.cshl.edu/fastx_toolkit/, last accessed April 3, 2013) 
with a threshold of 0.3. We assembled the resulting 
30,951,598 read pairs using two distinct approaches. First, 
we used Oases (Schulz et al. 2012) with k-mers ranging from 



25 to 53, obtaining 24,358 contigs (length 1 00-1 5,000 bp). In 
the second approach, we used ABySS (Simpson et al. 2009) 
with k-mer 45 and obtained 140,736 contigs. The two sets 
were merged using cd-hit (Li and Godzik 2006) with an identity 
threshold of 1 00% and eventually superassembled using CAP3 
(Huang and Madan 1 999) using default settings. The final data 
set consisted of 25,810 putative transcripts with lengths vary- 
ing from 50 to 16,500 bp. 

Nuclear Genome Assembly 

Assembly of the nuclear genome was performed using both 
180 bp and 300 bp libraries. The 180 bp library generated 
67,153,264100 base read pairs totaling 14.3 gigabases (Gb) 
and the 300 bp library 51 ,792,255 1 00 base read pairs covering 
1 0.4 Gb. The insert sizes of both libraries were close to expec- 
tations. We initially partitioned the reads depending on 
whether they originated from nuclear, mitochondrial, or 
Wolbachia DNA. Nuclear genome assembly was based on 
reads that were not mappable to a reference database of the 
genomes of five Wolbachia strains {W. ananassae, 
W. melanogaster, W. simulans, W. willinstoni, and wRi) or to 
the D. melanogaster mitochondrial DNA (mtDNA). Mapping 
was performed using Smalt (http://www.sanger.ac.uk/resour 
ces/software/smalt, last accessed April 3, 201 3; see supplemen- 
tary table S1, Supplementary Material online). Reads that 
passed this screening were further cleaned using sickle 
(https://github.com/najoshi/sickle, last accessed April 3, 2013) 
with a quality score cutoff of 25 (phred scale) applied to a sliding 
window of 40 bp. Following this step, reads had an average 
length of 93 bases (standard deviation [SD] = 1 4) and 94 bases 
(SD = 1 5) for the 1 80 and 300 bp libraries, respectively, an av- 
erage quality value of 35, and spanned a total of 20 Gb. 
Assuming similar genome sizes in D. suzukii and 
D. melanogaster, this translates to a coverage of approximately 
168-fold. Genome assembly was carried out using ABySS 
(Simpson et al. 2009) with k-mer size ranging from 48 to 64 
(supplementary table S2, Supplementary Material online). 
After quality assessment of the assemblies, we retained as 
best assembly the one obtained using a k-mer of 64. All contigs 
longer than 1 kb have been submitted to the European 
Nucleotide Archive at EBI web site (http://www.ebi.ac.uk/, 
last accessed April 3, 2013) under ID CAKG01 000001 - 
CAKG01061569. 

Assembly of Drosophilid Mitochondrial Genomes 

All D. suzukii reads that matched the D. melanogaster mtDNA 
were assembled using Geneious (http://www.geneious.com, 
last accessed April 3, 2013), generating 15 contigs, the lon- 
gest of which (14,736 bp) was identified as the nearly com- 
plete D. suzukii mtDNA. This fragment covers all genes but 
lacks the control region, whose length is unknown. To assist 
our phylogenetic analyses, we also reconstructed the partial 
mitochondrial genomes of eight additional Drosophila species 
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(D. biarmipes, D. bipectinata, D. elegans, D. eugracilis, 
D. ficusphila, D. kikkawai, D. rhopaloa, and D. takahashi). 
The draft genomes and transcriptomes of these species 
were kindly made available by the Baylor College of 
Medicine and modENCODE Consortium (https://www.hgsc. 
bcm.edu/content/drosophila-modencode-project, last acces- 
sed April 3, 2013). For each species, we separately compared 
the transcriptome and draft genome against the 
D. melanogaster mtDNA using Basic Local Alignment Search 
Tool (BLAST) (Camacho et al. 2009). We used Geneious to 
assemble each set of contigs identified by BLAST, using the 
D. melanogaster mtDNA as a reference. We compared by eye 
the resulting assemblies to the complete mtDNA genome 
available for 12 other Drosophila species (Drosophila 12 
Genomes Consortium et al. 2007), revealing many putative 
nuclear mitochondrial DNA. Finally, we retained only the 
transcriptome-based assemblies. These contained a large 
number of undetermined sites due to the expected intraspe- 
cific mtDNA polymorphism in the source Drosophila 
populations. 

Repeat Identification 

To the genome of D. suzukii and the other eight Drosophila 
species mentioned earlier, we added the (draft) genomes of 
12 additional Drosophila species (D. melanogaster, 
D. ananassae, D. sechellia, D. simulans, D. yakuba, D. erecta, 
D. pseudobscura, D. persimilis, D. willistoni, D. mojavensis, 
D. virilis, and D. grimshawi; downloaded from http://flybase. 
org, last accessed April 3, 2013). In each genome, we auto- 
matically annotated repeats using RepeatMasker(http://www. 
repeatmasker.org, last accessed April 3, 2013), at default set- 
tings, and then used the Repbase database (Jurka et al. 2005) 
as a reference for de novo identification. We analyzed the 
entire genome without distinguishing between euchromatin 
and heterochromatin partitions, as these information are 
either incomplete or unknown for most of the Drosophila 
species used in this study. We used all fragments irrespective 
of their length, because the D. suzukii genome assembly and 
some of the other draft genomes contained many contigs 
shorter than the 200 kb limit recommended (Drosophila 12 
Genomes Consortium et al. 2007). We quantified the 
presence and size of repeats as the percentage of repeated 
sequences over the draft genome size. This approach has 
the advantage of reducing biases due to the uncertain draft 
genome size of the different species, which may vary due 
to the different assembly strategies and/or genome qual- 
ity levels, and may not reflect the actual genome size. 
To account for this inaccuracy, we further calculated the 
percentage of total repeats using two contrasting and 
conservative estimates of the putative average Drosophila 
genome size (a minimum at 130 Mb and a maximum at 
180 Mb). 



Orthologous Gene Set Identification 

For comparative genomic analyses, we collated data for 2 1 Dro- 
sophila species. We downloaded the latest coding sequences 
(CDS) data sets available for D. melanogaster (release 5.43) and 
D. ananassae (release 1 .3) from FlyBase, as well the masked 
alignments of all single-copy orthologs used in the 12 Droso- 
phila project available from ftp://ftp.flybase.net/genomes/12_ 
species_analysis/clark_eisen/alignments/all_species.guide_ 
tree.longest.cds.tar.gz (last accessed April 3, 201 3) (Drosophila 
1 2 Genomes Consortium et al. 2007). We also downloaded the 
assembled RNA-Seq data of eight modENCODE Drosophila 
species (https://www.hgsc.bcm.edu/content/drosophila-mod- 
encode-project, last accessed April 3, 2013). We identified 
best-hit homologous sequences between the nine RNA-Seq 
and two CDS data sets using pairwise BLASTn (optimized 
using the parameter "-best_hit_overhang 0.1 5"). 

Rates of molecular evolution and tests of positive selection 
were based on the set of orthologous genes identified in 
D. melanogaster, D. biarmipes, D. takahashi, D. suzukii, and 
D. ananassae. To minimize the possibility of spurious matches, 
we filtered matches to exclude any with less than 60% of the 
length of either sequence aligned. We produced two lists of 
putative ortholog sets from this five-species set. In the first 
(" STAR orthologues"), we identified as orthologous genes the 
reciprocal best hits between D. melanogaster and each of 
D. biarmipes, D. takahashi, D. suzukii, and D. ananassae (see 
supplementary fig. S3, Supplementary Material online). Using 
this approach, we identified a total of 2,336 STAR ortholog 
quintuplets. 

The second, more conservative list, included only those 
genes found as reciprocal best hits for all pairwise comparisons 
between the five species ( WEB orthologs; see supplementary 
fig. S3, Supplementary Material online). This data set included 
1,021 WEB ortholog quintets and by definition is a subset of the 
STAR orthologs data set. All sequences within each ortholog set 
were oriented based on the D. melanogaster sequence and 
aligned with MUSCLE (Edgar 2004). We then trimmed partial 
codons at the 5'- and 3'-ends based on the D. melanogaster 
sequence. 

For the ortholog groups used for molecular evolution anal- 
yses, we then extracted the portion of the alignments with 
representation from all taxa. Finally, all alignments were 
realigned using Prank (Loytynoja and Goldman 2008) as im- 
plemented in TranslatorX (Abascal et al. 2010), which aligns 
protein-coding nucleotide sequences based on their corre- 
sponding amino acid translations. 

We removed from these two data sets all orthologs 
sets with alignments shorter than 100 bp. The resulting 
2,263 STAR ortholog quintuplets had a mean length ± standard 
error (SE) of 1 ,335.7 ± 29.0 bp (median = 1 ,092 bp; mode = 
942 bp), corresponding to 69.9 ±29.5% of the 
D. melanogaster gene length. The 1,007 WEB ortholog quintu- 
plets had a mean length ± SE of 1,575.1 ±29. 8bp 
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(median = 1,275 bp; mode = 606 bp), corresponding to 
76.4 ±26.0% of the D. melanogaster gene length. We 
found that the results of our analyses did not change qualita- 
tively when based on STAR orthologs or WEB orthologs. Thus, for 
ease of presentation, and unless specified, we have presented 
only those obtained using the STAR orthologs data set. 



(R Development Core Team 2009). We note that the recipro- 
cal best-hit approach is prone to miss genes with high se- 
quence divergence, including those that underwent 
particularly intense divergent adaptive evolution. Thus, we 
could have missed targets of positive selection among our 
sequenced genes. 



Analyses of the Rate of DNA and Protein Evolution 

Rates of molecular evolution were analyzed for both 
WEB orthologs and STAR orthologs using PAML 4.4 (Yang 
2007). We estimated the rate of nonsynonymous substitution, 
c/ N (leading to amino acid changes), and synonymous substi- 
tution, d s (which should accumulate neutrally), over all 
branches of the phylogenetic tree using the "free-ratio" 
model (M0' [Yang 1998]; model = 1 and NSsites = 0). This 
model allows co = d N /d Sf that is, the level of selective pressure 
experienced by a gene, to vary among branches of the tree. 
Following the results of the phylogenetic analysis (see later), 
the input unrooted tree had the structure (D. melanogaster, 
(D. ananassae,(D. takahashi,(D. biarmipes,D. suzukii)))). We 
then used PAML to test different models of substitution 
rates across coding sites (Yang and Nielsen 2000; Yang 
et al. 2000), with the aim of detecting genes that either 
evolved at a different rate or underwent positive selection 
along the D. suzukii lineage. 

In the first test, we compared models that assumed one or 
more substitution rates across the phylogeny. The first of such 
models is the basic "one-ratio" branch model (M0), which 
assumes a constant co across the phylogeny (model = 0 and 
NSsites = 0). Following the manual recommendations, this 
model was used to get the branch lengths for each gene 
tree, which were then copied into the tree structure file to 
be used with the "branch and site" substitution models. The 
likelihood of the M0 model was compared with that of a 
branch model that assumed two co values, one for the 
D. suzukii branch (the so called foreground branch) and one 
for the rest of the tree (the background branches; model = 2 
and NSsites = 0). Subsequently, the value of twice the differ- 
ence between the two likelihoods (2 AX) was tested using a x 2 
test with 1 degree of freedom. 

The occurrence of positive selection was tested by the 
branch-site test, which aimed at detecting positive selection 
affecting a few sites along the D. suzukii foreground branch. 
In this test (branch-site model A, test 2 (Yang et al. 2005), co 
can vary both among sites in the protein and across branches 
on the tree (model = 2, NSsites = 2). As for the branch model, 
we used tree structures with branch lengths estimated by 
model M0. The null model fixed co 2 =1 (f ix_omega = 1 , 
omega = 1), whereas the positive selection model allowed 
co 2 > 1 (f ix_omega = 0, omega = 1). The likelihood ratio test 
had 1 degree of freedom. To account for multiple testing, we 
also estimated the false discovery rate (FDR) of each test using 
the q value approach (Storey 2002) implemented in R 



Codon Usage Analysis 

We inferred preferred codons and codon usage bias in 
D. melanogaster, D. ananassae, D. takahashi, D. biarmipes, 
and D. suzukii in the genes of the STARorthologs groups 
with more than 30 codons. We estimated codon bias using 
the effective number of codons, Nc (Wright 1990), and the 
frequency of optimal codons, Fop (Ikemura 1981): Stronger 
synonymous codon usage bias is identified by larger Fop 
values and lower Nc values. Both indices were calculated 
using the program CodonW (http://codonw.sourceforge.net, 
last accessed April 3, 2013). Putative optimal (preferred) 
codons were identified as those that were significantly over- 
represented in the 5% of genes with highest and lowest 
usage frequencies (supplementary table S3, Supplementary 
Material online). Base composition affected synonymous 
codon usage, as shown by the strong correlation between 
GC and GC3 (GC in the third codon position) content and 
both Nc and Fop (Spearman correlation, P<10~ 16 ). To 
remove the potential noise due to this correlation, we esti- 
mated a version of the effective number of codons, Nd, which 
accounts for background nucleotide composition (Novembre 
2002). We also used in our analyses the residuals of the 
regression between GC3 content and Fop and Nc. 

Transcriptome and mtDNA Data Sets for Phylogenetic 
Analyses 

We assembled a data set of 91 orthologous genes from the 
transcriptomes of 21 Drosophila species including D. suzukii. 
Strict orthology within the complete set of D. melanogaster 
genes (Drosophila 12 Genomes Consortium et al. 2007) and 
the other 20 transcriptomes was assessed using the reciprocal 
best BLAST hits method. We first identified single copy 
WEBorthologs between D. melanogaster, D. biarmipes, 
D. bipectinata, D. elegans, D. eugracislis, D. ficusphila, 
D. kikkawai, D. rhopaloa, D. takahashi, and D. suzukii. We 
identified the masked alignments of all these WEBorthologs 
in the 12 Drosophila alignments (Drosophila 12 Genomes 
Consortium et al. 2007), thus selecting 97 groups of putative 
orthologs. A few of these were removed after manual inspec- 
tion revealed that they contained incomplete, frame-shifted, 
and/or dubiously assembled sequences, leaving 91 highly re- 
liable ortholog groups. These were aligned using TranslatorX 
and concatenated into a superalignment of 200,475 bp, 
which was further inspected by eye and corrected for the 
correct frame of codons (inclusion of partial stop codons 



748 Genome Biol. Evol. 5(4)745-757. doi:1 0.1 093/gbe/evt034 Advance Access publication March 1 5, 201 3 



Drosophila suzukii Phylogenomics 



GBE 



that altered the frame) and minor errors that escaped the first 
manual inspection. 

We translated this alignment into amino acids and selected 
conserved regions using Gblocks (Castresana 2000) with pa- 
rameters 1 :1 1, 2:17, 3:8, 4:10, and 5:half). We retained 90% 
of the sites, totaling 60,757 amino acids. The final nucleotide 
alignment of 182,271 bp, perfectly corresponding to the 
amino acid alignment, was used for further sequence analyses 
excluding third codon positions. 

A mitochondrial genome alignment was constructed by 
extracting CDS from available and newly assembled (see ear- 
lier) mtDNA. The 1 2 CDS genes were checked for their correct 
codon frame and concatenated. We also excluded third 
codon positions from the mtDNA data set for further se- 
quence analyses. 

Phylogenetic Analyses 

We performed Bayesian and maximum likelihood (ML) analy- 
ses on both the transcriptomic and the mitochondrial genomic 
data sets. For the Bayesian analyses, we used PhyloBayes3 
(Lartillot et al. 2009) setting two independent runs until the 
maxdiff was less than 0.1. We calculated the 50% majority 
rule consensus trees by pulling sampled trees after a burn-in 
that minimized the maxdiff statistic in PhyloBayes3. ML anal- 
yses were performed using Phyml (Guindon et al. 2010) on 
100 nonparametric bootstrapped replicates. In all cases, a dis- 
crete gamma distribution (with four rate categories) was used 
to model among site rate variation. We performed three main 
experiments on both data sets using different data set treat- 
ments and models of replacement: 

1 . ML analyses on nucleotide alignments using all the three- 
codon position and a single nt-general time reversible 
(GTR) model for all codon positions (nucleotides positions 
1+2 + 3, GTR + G, ML in fig. 1). 

2. Bayesian analyses on nucleotide alignments using the CAT 
model after exclusion of the third codon position (nucleo- 
tide positions 1 +2, CAT+G, Bayesian). 

3. Bayesian analyses on the corresponding amino acid align- 
ments using a six-category Dayhoff recoding and the 
CAT + GTR model (amino acids-Dayhoff, CAT + GTR + G, 
Bayesian). 

Molecular Clock Analyses 

We performed two different molecular clock analyses. We 
first used PhyloBayes (Lartillot et al. 2009) on both the tran- 
scriptomic and mitochondrial genomic data sets at the nucle- 
otide level. We employed a CIR process clock model and a 
GTR + G model of replacement on both data sets using the 
fixed tree topology of figure 1 A We constrained four nodes 
as in Prud'homme et al. (2006) using their suggested biogeo- 
graphical calibrations. To account for uncertainty in biogeo- 
graphical constrains, we allowed both minima and maxima to 
be soft, thus allowing the posterior dates to be sampled 



outside the set bounds (Yang and Rannala 2006). We em- 
ployed a root prior of 80 Ma with a permissive SD of 40 
Myr and assumed a birth-death process along all nodes. We 
modeled replacement using CAT and the clock using CIR as in 
Rota-Stabelli, Daley, et al. (2013). In a second approach, we 
used BEAST (Drummond and Rambaut 2007) without con- 
straining internal nodes and the random local clock but only 
a normally distributed root prior centered at 80 Ma with SD 20 
Myr. We assumed the initial mutation rate of 0.0346 
(SD = 0.00281) suggested in Obbard et al. (2012). Because 
mutation rate refers to unconstrained sites, we used only 
the 4-fold degenerate sites of the genomic data set for the 
BEAST analysis. 

Field Monitoring and Trapping 

Field trapping for tests of distribution by altitude were carried 
out between 15 April (week 14) and 31 October (week 43) 
201 1 . Forty sites across Trento Province were chosen repre- 
senting both agricultural and natural ecosystems. Traps were 
deployed on a large-scale altitudinal gradient and assigned to 
four altitudinal ranges (<250m asl [n=10], 250-600 m asl 
[n = 1 0], 600-1 ,000 m asl [n = 1 0], > 1 ,000 m asl [n = 1 0]). At 
each trapping site, we placed, in a shady spot, one plastic 
transparent bottle with multiple small lateral holes (diameter 
between 5 and 10 mm) containing 250 ml of apple cider vin- 
egar as bait. Weekly, traps were checked, insects collected, 
and vinegar replaced. Weekly captures of D. suzukii in each 
trap were averaged per altitudinal range. 

Results 

Genome, Transcriptome, Mitogenome, and Wolbachia 
Sequencing 

We sequenced and assembled a draft genome and transcrip- 
tome of D. suzukii irom an Italian Alpine population. The draft 
genome was sequenced to high depth (an average of 80 x 
coverage) and comprises 49,558 contigs spanning a total of 
160 Mb. The draft transcriptome contains 25,810 unique se- 
quences. Both the size of the genome and its repetitive ele- 
ment content are comparable with that of D. melanogaster 
and other sequenced Drosophila (supplementary fig. S1, Sup- 
plementary Material online). We also assembled the nearly 
complete mitochondrial genome for D. suzukii (~15kb), 
whose size and gene content is similar to that of other se- 
quenced Drosophila. Finally, we extracted and assembled the 
genome of a Wolbachia endosymbiont (wSuzi, 1.3 Mb) har- 
bored by the Italian D. suzukii population. Preliminary analyses 
based on several genes identify wSuzi as closely related to wRi 
from D. simulans Riverside (Klasson et al. 2009). A more de- 
tailed characterization of wSuzi is presented in Siozios et al. 
(2013). 
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Fig. 1. — The evolutionary affinities of Drosophila suzukii and the other Drosophila species inferred from phylogenomic and mitogenomic data. (A) 
Phylogenetic analyses of 91 orthologous nuclear genes (200,475 bp). (B) Phylogenetic analyses of 1 2 mitochondrial genes (1 1 , 1 39 bp). Both data sets support 
an Asian affinity of D. suzukii. Drosophila images from Prud'homme and Gompel, used by permission. 



Molecular Phylogenetics Using Transcriptomic and 
mtDNA Data Sets 

We used data from the D. suzukii genome to conduct a com- 
prehensive multi-locus phylogenetic and dating analysis in the 
context of genome data from 20 additional Drosophila spe- 
cies. We conducted two separate analyses using two distinct 
data sets. 

In the first analysis, we used 91 protein-coding genes ex- 
tracted from the transcriptomes of the 21 species, covering 
more than 200,000 nucleotides (fig. 1A). We analyzed the 
aligned data both as nucleotides, excluding third codon posi- 
tions to exclude likely saturated positions or characters asso- 
ciated with synonymous substitutions, and as amino acid 
sequences. We also employed different phylogenetic frame- 
works (Bayesian and ML) and both homogenous and more 
sophisticated heterogeneous models such as CAT+ GTR on a 
Dayhoff recoded data set (Rota-Stabelli, Lartillot, et al. 2013). 
All analyses converged on a tree that supported a sister 
relationship between the suzukii and takahashii subgroups, 
and D. eugracilis as sister of the melanogaster subgroup 
(fig. D. 



In a second analysis, we reconstructed a phylogeny from 
the mitochondrial genomes of the 21 Drosophila species. We 
assembled nearly complete mitochondrial genomes for eight 
additional Drosophila species for which whole transcriptome 
shotgun data were available. Phylogenetic analyses using the 
same set of experimental procedures used for the transcrip- 
tome data set failed to support most of the findings of the 
genome-derived transcriptome tree (fig. M3). 

Molecular Clocks and (Paleo)Ecological Analyses 

We performed molecular clock analyses using both the tran- 
scriptome and the mtDNA data sets (fig. 2A, see Materials and 
Methods for details). Despite some discrepancies for the ages 
of nodes closer to the root, the two data sets converged in 
supporting divergence of D. suzukii from D. biarmipes in a 
period between 9 and 6 Ma (i.e., the Tortonian). 

To link these clock analyses with the current distribution of 
D. suzukii in Asia, we mapped the current known distribution 
of D. suzukii and their sister species onto a previously compiled 
climatic model of the Asian Tortonian (fig. IB). The current 
distribution of D. suzukii extends over the Tortonian montane 
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Fig. 2. — Molecular timetrees, paleoclimate, and field trapping suggest a montane-temperate origin of Drosophila suzukii. (A) Relaxed clock analyses of 
the Drosophila species using both the nuclear and mitochondrial data sets of figure 1 . Drosophila suzukii is predicted to have diversified toward the late 
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restricted toward the North East. These distributions suggest that D. suzukii speciated from D. biarmipes by adapting to more temperate mountainous 
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temperate forests, whereas D. biarmipes is confined to a more 
equatorial southern habitat. To investigate a possible prefer- 
ence for temperate climate in D. suzukii (see Discussion), we 
monitored the distribution of D. suzukii Italian populations 
along a gradient of altitude over 1 year (fig. 2Q. Drosophila 
suzukii preferentially inhabits higher, more temperate alti- 
tudes, although the majority of human activity and fruit 
sources are concentrated at lower altitudes. 

Reduced Rate of Molecular Evolution and Reduced 
Effective Population Size in D. suzukii 

We explored the patterns of molecular evolution of the 
D. suzukii genome by studying a set of 2,336 orthologous 
genes from five key species carefully chosen to illuminate 
key points in its evolutionary history. Drosophila suzukii 
genes are characterized by a slow rate of molecular evolution 
(fig. 3/\; supplementary table S4, Supplementary Material 
online). Both synonymous (d s ) and nonsynonymous (c/ N ) sub- 
stitutions rates are significantly lower compared with those of 
its sister species D. biarmipes (fig. 3B), consistent with a re- 
duction in substitution rate along the D. suzukii branch. This 
finding is reinforced by a molecular clock analysis that showed 



that D. suzukii has the lowest substitution rate among the 
Drosophila species considered (fig. 3Q. 

We next examined whether in D. suzukii the low substitution 
rate was accompanied by different levels of selective pressure 
compared with its close relative. The level of overall genomic 
selective pressure, as measured by the ratio d N /c/ S/ is on average 
significantly lower in D. suzukii than in D. biarmipes (fig. 3B). 
Interestingly, there is a significantly larger d N /d s in autosomal 
genes of D. suzukii compared with those of D. biarmipes, 
whereas the opposite is true for X-linked genes d N /d s 
(fig. 3B), consistent with a difference in levels of selective pres- 
sure between autosomes and the X chromosome. 

To obtain a broader picture of the evolutionary processes, 
we further analyzed the codon usage in these five species 
(supplementary tables S2 and S3, Supplementary Material 
online). In many organisms, synonymous codons are used 
with different frequencies, leading to codon usage bias. 
Such bias can be under weak selection (\NqS\ « 1 ) and is main- 
tained by the concurrent action of selection, drift, and muta- 
tion. Thus, in principle, codon usage bias should be stronger in 
species with larger effective population size, N e , compared 
with species with lower N e . Both the effective number of 
codons, Nc (Wright 1990), and the frequency of optimal 
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codons, Fop (Ikemura 1981), are significantly different be- 
tween D. suzukii and D. biarmipes (P< 10" 15 ; supplementary 
table S5, Supplementary Material online) and are consistent 
with less codon usage bias in the former. Because GC and 
GC3 (GC in the third codon position) content are significantly 
correlated to codon usage bias measures in D. suzukii and 
D. biarmipes (Spearman's rho > 0.68 for GC and rho < -0.65 
for GC3, P< 10" 15 , for both species), we repeated the com- 
parative analyses while correcting for compositional bias. 
Codon usage bias measures A/c and Fop do not differ signif- 
icantly between D. suzukii and D. biarmipes when correcting 
for GC or GC3 (P> 0.139, for both comparisons). The mod- 
ified version of A/c, which accounts for background nucleotide 
composition, Nd (Novembre 2002), is significantly larger in 
D. suzukii than in D. biarmipes (P=6x 10" 11 ), suggesting 
less codon usage bias in the former. 

The analysis of the rate of molecular evolution at the single- 
gene level revealed only few genes that evolved at different 



rates in the D. suzukii branch compared with the rest of the 
phylogenetic tree (D. melanogaster, D. ananassae, [D. taka- 
hashi, [D. biarmipes, D. suzukii}]) or where branch-site models 
detected the occurrence of positive selection specifically affect- 
ing sites along the D. suzukii branch (tables 1 and 2). 

Discussion 

The Evolutionary Affinities of D. suzukii and the Sister 
Group of the Drosophila Subgroup 

Analyses based on 91 nuclear protein-coding genes (fig. ^A) 
confirmed a sister relationship between the suzukii and taka- 
hashii subgroups (Yang et al. 2012). The melanogaster sub- 
group was found to be closely related to D. eugracilis, a new 
hypothesis of Sophophora evolution that was extremely 
robust to various data set treatments (exclusion of third 
codon positions in nucleotide sequences and translation into 
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Table 1 

Top 10 Genes Identified as Putative Target of Positive Selection along the Drosophila suzukii Branch 



Drosophila 

melanogaster 

Ortholog 



q Value b p 2 c 



co 2 



Function and Phenotype (Sexual, Neuronal, Thorax) 



Cyp4d20 

endos 
Ptp4E 

T48 

CG 15626 
Cyp4aa1 

Osi20 

CG 13397 

yemalpha 

toe 



7x10" 13 8x10" 10 



0.0026 



0.0200 



15.2 Predicted electron carrier activity. Protein with features of Cytochrome P450. 

Expression at moderate levels in the following postembryonic organs or tis- 
sues: adult head, adult eye, adult heart, adult spermathecae, and adult car- 
cass. 

999.0 Predicted sulfonylurea receptor binding activity. Involved in regulation of meiotic 
cell cycle, mitotic spindle organization, oogenesis, water homeostasis, and re- 
sponse to nutrient. Phenotypically relevant in egg, oocyte, and follicle cell. 
0.0048 999.0 Predicted transmembrane receptor protein tyrosine phosphatase activity. 

Involved in motor axon guidance, central nervous system development, and 
open tracheal system development. Phenotypically relevant in ventral nerve 
cord. 

0.0084 999.0 Unknown molecular function. Phenotypically relevant in ventral furrow. 
0.0030 693.7 Unknown molecular and biological function. 

0.0024 999.0 Predicted electron carrier activity. Involved in insecticide catabolic and hormone 
metabolic processes. 

0.0078 747.8 Unknown molecular and biological function. Phenotypically relevant in trichogen 
cell. 

0.0024 200.4 Predicted alpha-N-acetylglucosaminidase activity. Protein domains suggest in- 
volvement in carbohydrate metabolic process. 
0.0022 135.4 DNA binding activity. Involved in female meiosis. Phenotypically relevant in 
oocyte. 

0.0123 70.3 Predicted molecular function is sequence-specific DNA-binding transcription 

factor activity. Involved in compound eye development and negative regula- 
tion of transcription from RNA polymerase II promoter. Phenotypically relevant 
in scutum and scutellum (mesothoracic tergum). 

a Likelihood ratio test probability based on branch-site models of codon evolution, with D. suzukii set as foreground branch. 
Proportion of false positives (FDR) of the test. 

Proportion of sites under positive selection estimated in the foreground branch (D. suzukii) by the branch-site model A. 
d co estimated for the sites under positive selection in the foreground branch (D. suzukii) by the branch-site model A. 



0.00085 
0.00103 

0.00134 
0.00157 
0.00199 

0.00205 

0.00271 

0.00351 

0.00358 



0.27948 
0.29554 

0.33996 
0.36001 
0.39060 

0.39060 

0.47676 

0.52651 

0.52651 



the corresponding amino acid sequences) and experimental 
procedures (use of homogenous and heterogeneous substi- 
tution models in both a Bayesian and ML framework; see 
legend of fig. Not all relationships were well resolved 
using this large data set. The placement of D. ficusphila is 
data set and model dependent, but the use of the sophisti- 
cated CAT + GTR model coupled with Dayhoff recoding of the 
amino acid data set (performed to reduce possible systematic 
errors in phylogenomic analyses; Rota-Stabelli, Lartillot, et al. 
2013) points toward its grouping with D. rhopaloa and 
D. elegans. 

To corroborate our phylogenetic results, we further ana- 
lyzed an mtDNA data set, which failed to support all the find- 
ings of the nuclear gene set tree (fig. 2B). This is most likely 
because of a lack of phylogenetic signal in the mtDNA. Thus, 
the apparently robust bootstrap support (97%) against the 
sister relationship D. eugracilis-melanogaster subgroup 
vanishes when highly saturated third codon positions are ex- 
cluded, or when an amino acid data set was employed, indi- 
cating that signal contradicting the nuclear phylogeny carried 
by mitochondrial genomes is concentrated in unreliable third 



codon positions and/or synonymous substitutions. Overall, our 
phylogenetic analyses reveal that the African melanogaster 
subgroup evolved from within a rapid Asian radiation, identi- 
fying D. eugracilis as a key intermediate species to polarize 
evolutionary traits of the melanogaster subgroup. With 
respect to the placement of D. suzukii in the phylogeny, our 
analyses suggest that this species is the sister taxon of 
D. biarmipes. Drosophila subpulchrella, another little-studied 
fly in the suzukii subgroup, has been reported to have feeding 
behavior similar to that of D. suzukii (Mitsui et al. 2010). This 
species is, however, thought to be most closely related to 
D. pulchrella (hence its name), which is sister to the suzu- 
kii + biarmipes clade (Yang et al. 2012), suggesting indepen- 
dent acquisition of unripe fruit feeding. It will be important to 
explore the relationships of D. subpulchrella using genome- 
scale data. 

Rate of Molecular Evolution Suggests Winter Diapause 

Our results indicate that gene sequences evolve at a signifi- 
cantly lower rate in D. suzukii than in its sister species 
D. biarmpes (fig. 3 and table 1). We hypothesize that the 
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Table 2 

Top 10 Genes Evolving at a Significantly Different Rate along the Drosophila suzukii Branch 



Drosophila P 3 q Value b oo FB c co R d Function and Phenotype (Sex, Neuron, Thorax) 

melanogaster 

Ortholog 



ran 


6x10- 10 


0.000001 


0.0752 


0.0001 


GTP and protein-binding activity. Involved in regulation of meiotic spindle organiza- 
tion, cell cycle, cell shape, cell adhesion, and actin filament organization. 
Phenotypically relevant in photoreceptor cell R7, meiotic spindle, karyosome, om- 
matidium, and pigment cell. 


mtm 


1 x 10" 9 


0.000002 


0.1639 


0.0190 


Phosphatidylinositol-3-phosphatase activity. Involved in mitotic cell cycle, chromo- 
some segregation, and response to wounding. Phenotypically relevant in sessile 
hemocyte and embryonic/larval hemocyte. 


wed 


4x 10" 8 


0.00003 


0.3814 


0.0782 


Unknown molecular function. Involved in ribosome biogenesis, neuroblast prolifera- 
tion, and female germ-line stem cell division. Phenotypically relevant in trichogen 
cell and mesothoracic tergum. 


l(3)72Dn 


9x 10" 8 


0.00005 


0.4781 


0.1330 


Unknown molecular function. Involved in ribosome biogenesis and neurogenesis. 
Phenotypically relevant in mesothoracic tergum. 


CG9135 


7x 10" 7 


0.00031 


0.2774 


0.0150 


Predicted guanyl-nucleotide exchange factor activity. Unknown biological function. 
Phenotypically relevant in mesothoracic tergum. 


CG8562 


3x 1CT 6 


0.00105 


0.3100 


0.0624 


Predicted metal locarboxypeptidase activity. Protein domains suggest involvement in 
proteolysis. 


Oatp33Ea 


0.00001 


0.00175 


0.2131 


0.0556 


Predicted organic anion transmembrane transporter activity. 


Ibk 


0.00001 


0.00175 


0.0001 


0.0435 


Involved in chaeta morphogenesis and oogenesis. 


lid 


0.00001 


0.00175 


0.1231 


0.0421 


Histone acety transferase and demethylase activity. Involved in chromatin organiza- 
tion, and histone acetylation and demethylation. Phenotypically relevant in 
mesothoracic tergum, imaginal disc, and embryonic/larval optic lobe. 


dome 


0.00001 


0.00270 


0.0186 


0.0886 


Transmembrane signaling receptor activity and protein heterodimerization activity. 
Involved in blastoderm segmentation, hindgut morphogenesis, border follicle cell 
migration, long-term memory, JAK-STAT cascade, open tracheal system develop- 
ment, and compound eye morphogenesis. Phenotypically relevant in spiracle, in- 
tegumentary specialization, embryonic hindgut, and compound eye. 



a Likelihood ratio test probability based on branch models of codon evolution, with D. suzukii set as foreground branch. 

Proportion of false positives (FDR) of the test. 

c c£> estimated for the focal (D. suzukii) branch. 

d a> estimated for the rest of the phylogenetic tree. 



low substitution rate of D. suzukii could be due to an idiosyn- 
cratic, low mutation rate, and/or because the species has a 
reduced number of generations per year compared with its 
relatives. The reproductive ecology of the species supports the 
second hypothesis, as in its distributional range D. suzukii re- 
produces only during the warm season and is able to over- 
winter as sexually immature, cold tolerant females (Mitsui 
et al. 2010). Our genomic evidence supports the hypothesis 
that D. suzukii has a winter sexual diapause and thus had a 
reduced number of generations since its last common ances- 
tor with D. biarmipes. 

Reduced Effective Population Size Affects the Efficiency 
of Positive Selection in D. suzukii 

The level of overall genomic selective pressures, as measured 
by the ratio c/ N /c/ S/ is lower in D. suzukii than in D. biarmipes 
(fig. 2B). This result is consistent with more relaxed selection 
along the D. suzukii lineage, possibly because this species has 
a smaller effective population size, N e , than D. biarmipes. A 



reduced N e would allow the fixation of larger number of 
slightly deleterious nonsynonymous mutations (Charlesworth 
2009), as further supported by the observation that D. suzukii 
has a lower frequency of optimal codons and a lower codon 
usage bias than D. biarmipes (supplementary tables S2 and S3, 
Supplementary Material online). The alternative possibility that 
larger c/ N /c/ s values correspond to pervasive positive selection 
in the D. suzukii genome (i.e., increased fixation of beneficial 
mutations) is not supported by our data. First, the fixation of 
favorable alleles in multiple genes would lead to a high dis- 
persion in the distribution of c/ N across the genome (Presgraves 
2005), whereas the variance in c/ N is significantly lower in 
D. suzukii than in D. biarmipes (2.9 x 10" 5 vs. 4.4 x 10" 5 , F 
test P< 10" 15 ). Second, only a few genes were detected as 
significant targets of positive selection in D. suzukii (table 1). 
Thus, the most likely explanation for the low substitution rate 
of D. suzukii is a reduced number of generations per year and 
a smaller N e compared with its relatives. It is unlikely that the 
reduced N e is a direct consequence of the bottleneck 
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associated with a colonization of Europe, as the invasion 
took place only few generations ago (the first record 
dates back to 2008, Cini et al. 2012), and thus the genome- 
wide pattern of substitutions should represent that of the an- 
cestral population. We propose instead that the low substitu- 
tion rate reflects the ecology and evolutionary history of this 
species. The winter diapause, we suggest, explains both the 
low substitution rate and the reduced selection efficiency in 
D. suzukii compared with D. biarmipes. Overwintering dia- 
pause results in recurrent population size bottlenecks, particu- 
larly for males, and thus in a lower effective population size A/ e , 
and lower selection efficiency in removing slightly deleterious 
nonsynonymous mutations, as indicated by its genome-wide 
higher d N /d s . 

The hypothesis that males undergo more severe bottle- 
necks than females is supported by the discrepancy in levels 
of selective pressure between the autosomes and the X chro- 
mosome. As males contain only one copy of X (and two of the 
autosomes), sex-biased population size changes would alter 
relative levels of X-linked and autosomal A/ e , namely by de- 
creasing A/ e of autosomes 2-fold relative to X chromosome in 
males. We indeed observed a significantly larger c/ N /c/ s in au- 
tosomal genes of D. suzukii than of D. biarmipes, whereas in 
X-linked genes, c/ N /c/ s was lower in D. suzukii than in 
D. biarmipes (fig. 3B). If we assume that levels of c/ N /d s are 
a proxy for effective population size, the X/autosome ratio of 
A/ e values, A/ eX /A/ eA , seems to be higher in D. suzukii than in 
D. biarmipes, possibly leading to differences in the efficiency 
of selection on the X and autosomes between the two species. 
One hypothesis to explain this observation is a difference in 
the efficiency of purifying selection in removing recessive del- 
eterious mutations in hemizygous males, a phenomenon 
which can often lead to a faster-X effect (Charlesworth 
et al. 1987; Vicoso and Charlesworth 2009a). Thus, the bot- 
tlenecks associated with the winter diapause of D. suzukii 
could be directly responsible for the relative difference in 
A/ eX /A/ eA between the two species (Pool and Nielsen 2007). 
Other factors that may have affected the differences in the 
ratio A/ex/A/ e A between D. suzukii and D. biarmipes include 
different recombination rates (Vicoso and Charlesworth 
2009b) and variance in male reproductive success due to 
sexual competition among males (Andersson 1994; see 
Mank et al. 201 0 for a review). Additional genetic and behav- 
ioral studies will be necessary to disentangle these forces and 
evaluate their role in the evolution of D. suzukii. 

Paleobiology and Adaptation to Temperate Ecology 

The presence of a winter diapause in D. suzukii may be an 
adaptation that is relevant to the switch in ecology of the 
species. Relaxed clock studies of both nuclear and mitochon- 
drial genomes (fig. 2A) converged on a scenario in which 
D. suzukii diverged from D. biarmipes approximately 9-6 
Ma, toward the end of the Miocene (Tortonian). Climate 



modeling has shown that, during the Tortonian, the ecology 
of region between North India, Indochina, and the Chinese 
coasts (delineated by the yellow line in fig. 2B) was character- 
ized by extended montane temperate forests. Toward the 
present, forests reduced in extent to the North and East and 
alternated with scrublands or tropical forests (Pound et al. 
201 1). The present endemic distribution of D. suzukii extends 
over this region, whereas D. biarmipes is endemic to a more 
equatorial, southern habitat. The distribution of the two spe- 
cies suggests that speciation of D. suzukii was accompanied by 
adaptation to temperate habitats, through the increased uplift 
of the Tibetean plateau and the concomitant intensifi- 
cation of the monsoon regime in the Tortonian (Zachos 
etal. 2001). 

A strong preference for montane temperate climate in cur- 
rent invasive populations is supported by the results of trap- 
ping surveys. Although extensive soft fruit production is 
concentrated below 600 m asl in the surveyed Trentino 
Province of North Italy, the majority of the captures we 
made were at higher altitudes (fig. 2Q. The proposal that 
D. suzukii originated in a temperate, montane ecology is con- 
gruent with its current life habit. In temperate forests, fruit 
production, and thus the availability of rotting fruit, is highly 
seasonal, whereas in the tropics fruiting, and thus the produc- 
tion of rotted food sources, is near-continuous (Willson 1 991 ; 
Ting et al. 2008). For a species occupying a temperate ecosys- 
tem, ovipositing in fresh fruit is required to access food. 
Growing larvae can then accelerate decay and fermentation 
to provide food for the adult stage. Overwintering diapause 
bridges the winter months when fruits are scarce if not 
absent, and low temperatures limit both fermentation and 
fly activity. 

Preadaptations Suggest Invasive Success 

An innate predisposition to temperate climates might also ex- 
plain why D. suzukii was able to invade European and North 
American regions so rapidly. Drosophila melanogaster has also 
invaded temperate climates, but the colonization originated 
from an ancestral tropical African range and was accompa- 
nied by local adaptation (Ometto et al. 2005), whereas invad- 
ing populations of D. suzukii are likely to have already had 
many traits adaptive in the newly occupied range. Genes 
under positive selection (table 1) and those with fast evolu- 
tionary rates (table 2) are good candidates for loci involved in 
such adaptations. We found no evidence for a significant 
overrepresentation of Gene Ontology-defined functional clas- 
ses in these gene sets, but many of the identified genes are 
phenotypically linked (through their D. melanogaster 
orthologs) with biology of the genitalia, the neuronal system 
and (particularly and unexpectedly) with the mesothoracic 
tergum. 

The genetic and neurological bases of the adaptive beha- 
vioral and lifecycle traits of D. suzukii may hold the keys to 
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understanding the origin of a novel behavioral repertoire and 
lead to strategies for control of this pest in western countries. 
Because the current hypothesis of the phylogeny within the 
suzuki i subgroup has not yet been confirmed by whole-ge- 
nome phylogenetics, we cannot exclude the possibility that 
D. subpulchrella is sister to D. suzukii rather than D. pulchrella. 
Under this scenario, some adaptations currently modeled as 
arising within D. suzukii may in fact be shared with 
D. subpulchrella. Genome sequencing of D. subpulchrella 
will clarify this question. Our evolutionary analyses of the 
D. suzukii genome suggest an intriguing causal link between 
adaptation to temperate environments and its particular biol- 
ogy. The genetic bases of adaptation to temperature could be 
a key factor to develop new pesticides or containment strat- 
egies for this pest. 

Supplementary Material 

Supplementary figures S1-S3 and tables S1-S5 are available 
at Genome Biology and Evolution online (http:/A/vww.gbe. 
oxfordjournals.org). 
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